Fuzzy curve analysis based soft sensor modeling method using time difference Gaussian process regression

ABSTRACT

The invention provides a fuzzy curve analysis based soft sensor modeling method using time difference Gaussian process regression, it is suitable for application in chemical process with time delay characteristics. This method can extract stable delay information from the historical database of process and introduce more relevant modeling data sequence to the dominant variable sequence. First of all, the method of fuzzy curve analysis (FCA) can intuitively judge the importance of the input sequence to the output sequence, estimate the time-delay parameters of process, and such offline time-delay parameter set can be utilized to restructure the modeling data. For the new input data, based on the historical variable value before a certain time, the current dominant value can be predicted by time difference Gaussian Process Regression (TDGPR) model. This method does not encounter the problem of model updating and can effectively track the drift between input and output data. Compared with steady-state modeling methods, this invention can achieve more accurate predictions of the key variable, thus improving product quality and reducing production costs.

CROSS-REFERENCES AND RELATED APPLICATIONS

This application claims the benefit of priority to Chinese Application No. 201510541727.5, entitled “A fuzzy curve analysis based soft sensor modeling method using time difference Gaussian process regression”, filed on Aug. 28, 2015, which is herein incorporated by reference in its entirety.

BACKGROUND OF THE INVENTION

Field of the Invention

The invention relates to a fuzzy curve analysis based time difference Gaussian process regression soft sensor modeling method (FCA-TDGPR), and belongs to the field of complex industrial process modeling and soft sensing.

Description of the Related Art

The traditional soft sensor modeling methods mostly consider the characteristics of zero delay, that is, considering the input and output with the same sampling interval and input variable collected at time t corresponds to the t-th dominant variable sample in the database. However, there is a significant time lag between the input data collected by each sensor and the output data obtained through the laboratory analysis or online instrumentation. If we continue using steady-state modeling approaches, the established model will be unable to fully explain the characteristics of the process, and it does not meet the causality of actual process. In order to ensure that the soft sensor model can achieve accurate predictions of the key variables in a long time, it is necessary to take measures to introduce the time delay and dynamic information.

Essentially speaking, most of the existing time delay estimation methods are trying to find the auxiliary variables which are most closely related to the dominant variables for modeling. When applying such methods to practical applications, it is needed to get a tradeoff between the complexity and the accuracy of the algorithm. In view of the process time delay estimation problem, at present, the existing methods include mutual information (MI) method, correlation coefficient method, etc. The invention adopts fuzzy curve analysis method to introduce variable time delay information into the soft sensor model, and the characteristic of this method is low computational complexity while being easy to understand. It is possible to visually and effectively determine the importance degree of the input variables.

The performance of soft sensor model needs to be maintained by periodic reconstruction in order to dynamically track and effectively take control of the process. The main methods include Moving Window (MW), iterative approach and Just-in-time Learning (JITL), but these methods often need to update the model frequently. Time difference model not only can deal with the problem of model performance degradation due to time lapse and achieve the effect of tracking process dynamics exactly as model updating it also is able to minimize the likelihood of model reconstruction.

Up to now, data-driven based soft sensor modeling methods have received more and more attention. Some commonly used methods like partial least squares (PLS) and principal component analysis (PCA). They can describe well the linear relationship between input variables and output variables. And artificial neural networks (ANN), support vector machine (SVM) and least squares support vector machine (LS-SVM) can effectively deal with the nonlinear characteristics of the process. In recent years, as a non-parametric probabilistic model, GPR not only can give the predictive value, but also can get the uncertain degree of predictive value. Therefore, the GPR model is selected as the basic soft sensor modeling algorithm in the present invention, and combined with the TD modeling strategy to effectively deal with the drift between input and output data in the process.

In summary, considering the establishment of a time delay online soft sensor model for the strict control of key variables in the process is of great significance.

DETAILED DESCRIPTION

For time delay process, the soft sensor model without delay can't be modeled by the sequence of auxiliary variables which are the most relevant to the dominant variable. When such a model is employed to do estimation, the estimation accuracy of the dominant variable will be greatly affected. In order to effectively extract the process delay information and set up an online soft sensor model without the case of frequent model updating, it is necessary to provide a more effective online strategy for the prediction of key variable. Therefore, this invention provides a soft sensor modeling method based on FCA-TDGPR.

The invention is realized by the following technical scheme:

Soft sensor modeling method based on FCA-TDGPR which comprises the following steps: aiming at the time delay process, first of all, collect enough samples of data and constitute the historical database with the historical values of auxiliary variables and dominant variable.

The FCA method is used to determine the time delay parameters of each auxiliary variable with respect to the dominant variable sequence, which is used to reconstruct the soft sensor modeling data;

Then, When new input samples are available, a time difference Gaussian process regression (TDGPR) model is employed for current time online prediction based on historical variable values collected certain moments ago, thus making it possible to realize real-time estimation and control of the key variable, obtain more accurate results, increase the yield and reduce production costs.

BRIEF DESCRIPTION OF DRAWINGS

FIG. 1 is a flow chart of online soft sensor method based on FCA-TDGPR;

FIG. 2 is a schematic diagram of debutanizer process;

FIG. 3 is a schematic diagram of TDGPR modeling approach;

FIG. 4 contains fuzzy curve distribution diagrams of the original variables and optimal time delay variables;

FIG. 5 contains scatterplots of butane concentration predictions with different j values.

EXAMPLES

The modeling flow chart, which is shown in FIG. 1 below, is further detailed in the present invention:

Take the actual chemical process as an example, debutanizer is an important part of naphtha desulfurization and separation device of oil refining production process, and one of the dominant variables needed to be controlled for this process is the concentration of the bottom butane (C4). The schematic diagram of the process is shown in FIG. 2, due to the value of C4 cannot be directly measured, therefore, there is a delay issue in analyzing and obtaining C4 concentration values. At the same time, different auxiliary variables show different degrees of time delay. Experimental data is derived from the actual industrial process which contains 2394 samples, a total of 7 auxiliary variables. As shown in FIG. 2, x₁ is the top temperature; x₂ is the top pressure; x₃ is the reflux flow; x₄ is the top product outflow; x₅ is the 6th tray temperature; x₆ is the bottom temperature 1; x₇ is the bottom temperature 2. The 1 dominant variable is the bottom butane concentration, which in this invention is predicted as the key variable of the process.

Step1: Collect historical input and output data to form a training database which contains N continuous samples. Assuming that the data is expressed as {X(t),y(t)},t 1, 2, . . . , N and is preprocessed, and the 2 bottom temperature variables are averaged as 1 auxiliary variable, then, X(t)=[x₁(t), x₂(t), x₃(t), x₄(t), x₅(t), x₆(t)]^(T). The maximum time delay T_(max) of 6 variables is set to 19.

Step2: For each of the original variables x_(i), i∈{1, 2, . . . , 6}, they are extended to the input variables with time delay {x_(i)(t−λ), λ=0, 1, . . . , T_(max)} by formula (1), and a set of 120 dimensional delay variables will be obtained for subsequent analysis

Step3: Determine the importance of each variable in the time delay input variable set by FCA, for (x_(i)(t),y(t)), fuzzy membership function of variable x_(i) is defined as:

$\begin{matrix} {{\Phi_{it}\left( x_{i} \right)} = {\exp \left\lbrack {- \left( \frac{{x_{i}(t)} - x_{i}}{b} \right)^{2}} \right\rbrack}} & (2) \end{matrix}$

For each x_(i), {Φ_(it), y(t)} provides a fuzzy rule which is described as {if x_(i) is Φ_(it)(x_(i)), then y is y(t)}, and Φ_(it) is a fuzzy membership function of input variable x_(i) at t-th data point; In formula (2) a Gaussian fuzzy membership function is selected; b is determined as 20% the range of variable x_(i). As a result, For N training samples, each sample corresponding to each variable has N fuzzy rules. In the fuzzy membership function, Φ_(it)=1 holds true at each point {x_(i)(t),y(t)}.

For time delay process, by introducing time delay information the original variable x_(i) becomes (T_(max)+1)-dimensional, which can be expressed as x_(i)(t−λ), λ=0, 1, . . . , T_(max), λ is a variable delay value to be introduced; fuzzy curve C_(i,λ) with the condition that λ is the i-th variable delay value can be obtained by making centroid defuzzification of each new expanded variable using formula (3); as shown in the formula (4), d_(i) is the λ which can make the maximum coverage of fuzzy curve C_(i,λ); C_(i,λ)(λ)_(max) is the maximum value of the fuzzy curve point range, while C_(i,λ)(λ)_(min) is the minimum value of the fuzzy curve point range;

$\begin{matrix} {{C_{i,\lambda}(\lambda)} = \frac{\sum_{t = 1}^{N}{{\Phi_{it}\left\lbrack {x_{i}\left( {t - \lambda} \right)} \right\rbrack} \cdot {y(t)}}}{\sum_{t = 1}^{N}{\Phi_{it}\left\lbrack {x_{i}\left( {t - \lambda} \right)} \right\rbrack}}} & (3) \\ {d_{i} = {\underset{\lambda}{argmax}\left\lbrack {{C_{i,\lambda}(\lambda)}_{\max} - {C_{i,\lambda}(\lambda)}_{\min}} \right\rbrack}} & (4) \end{matrix}$

If the scope of the C_(i,λ)(λ) range is closer to that of y, then the input variable x_(i)(t−λ) is more important. In view of this point, the importance degree of each variable can be determined by sorting the coverage of C_(i,λ)(λ). Finally, the optimal delay parameter d_(i) as well as time delay variable x_(i)(t−d_(i)) can thus be obtained by FCA method, which later on can be used for soft sensor modeling data reconstruction.

Step4: Based on the previous step, the time delay parameters d₁, d₂, . . . , d_(m) are used to reconstruct the training input sample set for on-line modeling, the reconstructed input dataset is denoted as X_(d)(t), X_(d)(t)[x₁(t−d₁), x₂(t−d₂), x₃(t−d₃), x₄(t−d₄), x₅(t−d₅), x₆(t−d₆)]. If there is a new input sample X(t+1), then the delay input set could be restructured based on historical database samples with the same parameters, then go to step 5, otherwise, wait for the arrival of new data.

Step 5: After the reorganization procedure, the training set and the new data are processed by j order time difference treatment (the value of j can be determined according to the sampling period and property of dominant variable):

ΔX _(d,j)(t)X _(d)(t)−X _(d)(t−j)

Δy _(j)(t)=y(t)−y(t−j)  (5)

Next, make a regression of the relationship between ΔX_(d,j)(t) and Δy_(j)(t) by GPR, which satisfies Δ(t)=f(ΔX_(j)(t))+e(t). The GPR method can obtain the mapping relationship through the given training input and output samples. In this way, the corresponding predictive value and the uncertainty degree can be obtained given the new input data, which means the result will be probabilistic. The GPR algorithm is shown as below.

In general, the relationship between the observed output value y and noise e satisfies:

y _(i) =f(x _(i))+ε

ε˜N(0,σ_(n) ²)  (6)

If the mean function and covariance function are determined, then the distribution of the Gaussian process is well-determined. For simplicity, the mean function is usually preprocessed into 0. Covariance function can transform the correlation of output data into the function of input data. As similar inputs produce similar outputs, the covariance function can be selected according to the characteristics of the sample distribution. One condition which must be satisfied is that the closer the distance of samples is, the more correlated the two samples are, and vise versa. The covariance function form of this invention is shown in formula (7):

$\begin{matrix} {{k\left( {x_{p},x_{q}} \right)} = {v\; {\exp \left\lbrack {{- \frac{1}{2}}{\sum\limits_{d = 1}^{D}\; {\pi_{d}\left( {x_{p}^{d} - x_{q}^{d}} \right)}^{2}}} \right\rbrack}}} & (7) \end{matrix}$

In the formula, x_(p), x_(q)∈R^(D), v controls the magnitude of the covariance function. π_(d) describes the relative importance of each input attribute x^(d). The determination of the hyper-parameter Θ_(gp)=(v, π₁, . . . , π_(D), σ_(R) ²) in the Gaussian process is generally estimated by the MLE method. The optimization of the parameters can be realized by using the conjugate gradient method. Based on test sample and training data, the posterior distribution of test data x_(*) can be calculated, and its predictive value obey the joint Gaussian distribution described in formula (9), where K(X,X) is n-dimensional covariance matrix of training samples; k(x_(*),X) is the covariance vector of test sample and training samples; k(x_(*),x_(*)) is the autocovariance of test sample, and f_(gp) is a predictive value of GPR.

$\begin{matrix} {{L\left( \Theta_{gp} \right)} = {{{- \frac{1}{2}}{y^{T}\left\lbrack {{K\left( {X,X} \right)} + {\sigma_{n}^{2}I}} \right\rbrack}^{- 1}y} - {\frac{1}{2}\log \; {\det \left\lbrack {{K\left( {X,X} \right)} + {\sigma_{n}^{2}I}} \right\rbrack}} - {\frac{n}{2}\log \; 2\pi}}} & (8) \\ {\mspace{79mu} {{{f_{gp}X},y,{\left. x_{*} \right.\sim{N\left( {\overset{\_}{f_{gp}},{{cov}\left( f_{gp} \right)}} \right)}}}\; \mspace{20mu} {{s.t.\mspace{14mu} \overset{\_}{f_{gp}}} = {{{k\left( {x_{*},X} \right)}\left\lbrack {{K\left( {X,X} \right)} + {\sigma_{n}^{2}I_{n}}} \right\rbrack}^{- 1}y}}\mspace{20mu} {{{cov}\left( f_{gp} \right)} = {{k\left( {x_{*},x_{*}} \right)} - {{{k\left( {X,x_{*}} \right)}^{T}\left\lbrack {{K\left( {X,X} \right)} + {\sigma_{n}^{2}I_{n}}} \right\rbrack}^{- 1} \cdot {k\left( {X,x_{*}} \right)}}}}}} & (9) \end{matrix}$

when the new input data arrives at time t+1, the calculating formula of predictive value y_(j,pred)(t+1) with TDGPR method is:

ΔX(t+1)=X(t+1)−X(t+1−j)

Δy _(j,pred)(t+1)=f _(GPR)(ΔX _(j)(t+1))

y _(j,pred)(t+1)=y _(j)(t+1−j)+Δy _(j,pred)(t+1)  (10)

In the actual industrial process, there will be the case of instrument damage or laboratory analysis with delay, and the circumstance that the time interval of obtaining dominant variable is large and the quantity is small or there is a lack of dominant analysis value in the database. Thus, shown in FIG. 3, for new incoming test data X(t+1), based on y_(i)(t+1−j) stored in the database j moment ago, the predictive value of the dominant variable at time t+1 can be obtained. The predicted output y_(j,pred)(t+1) of the online model is calculated by formula (10), and the predicted result of bottom butane concentration can be obtained.

As shown in FIG. 4, compared the original variables without time delay, 6 reconstructed variables contribute more to the dominant variable, which introduce more relevant modeling data for online modeling. At the same time, in order to verify the effectiveness of this invention for on-line estimation, the first 1519 samples are selected in 2394 samples to reconstruct 1500 training samples. The final 875 samples are then used as test samples, and a soft sensor is established for on-line prediction of butane concentration.

FIG. 5 contains scatterplots of butane concentration prediction results with two methods respectively denoted as the FCA-TDGPR method (present method) which involves delay estimation, and t-TDGPR method which is without time delay estimation.

From FIG. 5, when the time difference order j increases from 1 to 10, the time interval of prediction based on historical database is gradually increasing, and the prediction accuracy is declining. This is because the more recent the analysis value is, the better tracking ability the model has for current process dynamics.

Although the accuracy of the two methods are in decline, compared with the TDGPR method without considering the time delay, the predicted results of the present invention can be better close to the true value of the butane concentration when the time difference increases. This suggests that extracted delay information is in line with the actual causal relationship of the process, and the soft sensor model with variable time delay estimation is more accurate.

After fuzzy curve analysis method is taken to determine the optimal parameters, reconstructed data is proved to be capable of enhancing the accuracy of online model significantly by introducing more contributing auxiliary variables to dominant variable sequence. At the same time, it reflects that the GPR method can explain the dynamic change of the process well. The online soft sensor model based on TDGPR method can adaptively estimate real-time butane concentration with historical variables collected j time ago.

FIGS. 4 and 5 have jointly validated that fuzzy curve analysis based time difference Gaussian process regression soft sensor modeling method has good accuracy for on-line prediction of bottom butane concentration.

While the present invention has been described in some detail for purposes of clarity and understanding, one skilled in the art will appreciate that various changes in form and detail can be made without departing from the true scope of the invention. All figures, tables, appendices, patents, patent applications and publications, referred to above, are hereby incorporated by reference. 

What is claimed is:
 1. A soft sensor modeling method of fuzzy curve analysis based time difference Gaussian process regression, wherein the method comprises the following steps: step 1: collecting input and output variables of a process, constructing a historical training database and obtaining N samples: {X(t), y(t)}, t=1, 2, . . . , N; and then preprocessing the data; according to a process mechanism and experience, determining a maximum time delay T_(max) existing in auxiliary variables; step2: for each original auxiliary variable x_(i), i∈{1, 2, . . . , m}, extending it to input variable set with time delay {x_(i)(t−λ), λ=0, 1, . . . , T_(max)}, wherein an extension mode is:

$\begin{matrix} {{\Phi_{it}\left( x_{i} \right)} = {\exp \left\lbrack {- \left( \frac{{x_{i}(t)} - x_{i}}{b} \right)^{2}} \right\rbrack}} & (2) \end{matrix}$ step 3: determining importance of each variable in time delay input variable set by fuzzy curve analysis and obtaining an optimal time delay variable x_(i)(t−d_(i)); wherein determining the importance comprises: wherein the input variable set is {x_(i), i=1, 2, . . . , m} and output variable is y, for input an variable x_(i), whose sample value collected at time t is denoted as x_(i)(t), for (x_(i)(t),y(t)), a fuzzy membership function of variable x_(i) is defined as: $\begin{matrix} {{{\Phi_{it}\left( x_{i} \right)} = {\exp \left\lbrack {- \left( \frac{{x_{i}(t)} - x_{i}}{b} \right)^{2}} \right\rbrack}},} & (2) \end{matrix}$ wherein for each x_(i), {Φ_(it), y(t)} provides a fuzzy rule which is described as {if x_(i) is Φ_(it)(x_(i)), then y is y(t)}, and Φ_(it) is a fuzzy membership function of input variable x_(i) at the t-th data point; wherein in formula (2), a Gaussian fuzzy membership function is selected; wherein b is determined as 20% the range of variable x_(i); wherein as a result, for N training samples, each sample corresponding to each variable has N fuzzy rules; in the fuzzy membership function, Φ_(it)=1 holds true at each point {x_(i)(t),y(t)}; wherein for time delay process, by introducing time delay information, the original variable x_(i) becomes (T_(max)+1)-dimensional, which can be expressed as x_(i)(t−λ), λ=0, 1, . . . , T_(max), λ is a variable delay value introduced; fuzzy curve C_(i,λ) with the condition that λ is the i-th variable delay value can be obtained by making centroid defuzzification of each new expanded variable with formula (3); wherein as shown in the formula (4), d_(i) is the λ which can make the maximum coverage of fuzzy curve C_(i,λ); wherein C_(i,λ)(λ)_(max) is the maximum value of the fuzzy curve point range, while C_(i,λ)(λ)_(min) is the minimum value of the fuzzy curve point range; $\begin{matrix} {{C_{i,\lambda}(\lambda)} = \frac{\sum_{i = 1}^{N}{{\Phi_{it}\left\lbrack {x_{i}\left( {t - \lambda} \right)} \right\rbrack} \cdot {y(t)}}}{\sum_{i = 1}^{N}{\Phi_{it}\left\lbrack {x_{i}\left( {t - \lambda} \right)} \right\rbrack}}} & (3) \\ {d_{i} = {\underset{\lambda}{argmax}\left\lbrack {{C_{i,\lambda}(\lambda)}_{\max} - {C_{i,\lambda}(\lambda)}_{\min}} \right\rbrack}} & (4) \end{matrix}$ wherein, if the scope of the C_(i,λ)(λ) range is closer to that of y, then the input variable x_(i)(t−λ) is more important; wherein the important degree of each variable is determined by sorting the coverage of C_(i,λ)(λ); wherein the optimal time delay variable x_(i)(t−d_(i)) is obtained; step 4: using obtained x_(i)(t−d_(i)) in step 3 to form a delay input set X_(d)(t)=[x₁(t−d₁), x₂(t−d₂), . . . , x_(m)(t−d_(m))]^(T), and reconstructing soft sensor training sample set {X_(d)(t),y(t)}; wherein if there is a new input sample X(t+1) available, then the delay input set could be restructured based on historical database samples with the same parameters, then go to step 5, otherwise, wait for the arrival of new data; step 5: processing the training set and the new data by j order time difference treatment, wherein the value of j is determinable according to a sampling period and property of dominant variable; wherein a formula for time difference step is: ΔX _(d,j)(t)=X _(d)(t)−X _(d)(t−j) Δy _(j)(t)=y(t)−y(t−j)  (5) establishing a Gaussian process model between differential input samples and output samples; wherein a Gaussian process regression algorithm is presented as follows: wherein given training sample sets X∈R^(m×N) and y∈R^(N), m is the dimension of input data points, N is the number of samples, and the relationship between input sample x_(i)∈′″ and output sample y_(i)∈R satisfies: y _(i) =f(x _(i))+ε ε˜N(0,σ_(n) ²)  (6) wherein in formula (6), f is an unknown function, ε is Gaussian noise with zero mean and σ_(n) ² variance; for a new input sample, its corresponding probability prediction output also follows Gaussian distribution and the joint Gaussian distribution N(f_(gp) , cov(f_(gp))) is described as below: f _(gp) |X,y,x _(*) ˜N( f _(gp) ,cov(f _(gp))) s.i. f _(gp) =k(x _(*) ,X)[K(X,X)+σ_(n) ² I _(n)]⁻¹ y cov(f _(gp))=k(x _(*) ,x _(*))−k(X,x _(*))^(T) [K(X,X)+σ_(n) ₂ I _(n)]⁻¹ ·k(X,x _(*))  (7) wherein K(X, X) is a n-dimensional covariance matrix of training samples; wherein k(x_(*),X) is a covariance vector between test sample and training samples; wherein k(x_(*),x_(*)) is the autocovariance of test sample; wherein a Gaussian covariance function is: $\begin{matrix} {{k\left( {x_{p},x_{q}} \right)} = {v\; {\exp \left\lbrack {{- \frac{1}{2}}{\sum\limits_{d = 1}^{m}\; {\pi_{d}\left( {x_{p}^{d} - x_{q}^{d}} \right)}^{2}}} \right\rbrack}}} & (8) \end{matrix}$ wherein, in GPR algorithm, hyper-parameter Θ_(gp)=(v, π₁, . . . , π_(D), σ_(n) ²) is obtained via maximum likelihood estimation approach; wherein the likelihood function is derived as: $\begin{matrix} {{L\left( \Theta_{gp} \right)} = {{{- \frac{1}{2}}{y^{T}\left\lbrack {{K\left( {X,X} \right)} + {\sigma_{n}^{2}I}} \right\rbrack}^{- 1}y} - {\frac{1}{2}\log \; {\det \left\lbrack {{K\left( {X,X} \right)} + {\sigma_{n}^{2}I}} \right\rbrack}} - {\frac{n}{2}\log \; 2\pi}}} & (9) \end{matrix}$ wherein the hyper-parameter Θ_(gp) is set to be a random value within a reasonable range in the first place; wherein the conjugate gradient algorithm is utilized to obtain the optimized parameter set; wherein after obtaining the optimal hyper-parameter, for the test sample x_(*), formula (7) is used to estimate the output value of GPR model; step 6: after all samples are restructured with delays, when a new input data arrives at time t+1, based on y_(j)(t+1−j), calculating a predictive value y_(j,pred)(t+1) by TDGPR algorithm, wherein the formula is presented as below: ΔX _(d,j)(t+1)=X _(d)(t+1)−X _(d)(t+1−j) Δy _(j,pred)(t+1)=f _(GPR)(ΔX _(d,j)(t+1)) y _(j,pred)(t+1)=y _(j)(t+1−j)+Δy _(j,pred)(t+1)  (10) wherein y_(j,pred)(t+1) in formula (10) is the final dominant variable prediction value of the present invention.
 2. A soft sensor modeling method of fuzzy curve analysis based time difference Gaussian process regression according to claim 1, characterized in that this method extracts information of variable delay from process historical database, reconstructing soft sensor modeling data and correcting causal relationship between input and output data. 